Application No. 10/501,502 Docket No.: 0020-5278PUS1 

Amendment dated June 3, 2008 

After Final Office Action of February 4, 2008 

REMARKS 

Reconsideration and allowance of the subject application are respectfully requested. 
Applicant thanks the Examiner for total consideration given the present application. Claims 1-6 
and 8 are pending prior to the Office Action. No claims have been added through this reply. 
Claim 1 has been canceled without prejudice or disclaimer of the subject matter included 
therein. Therefore, claims 2-6 and 8 are pending. Claims 2, 6, and 8 are independent. 
Applicant respectfully requests reconsideration of the rejected claims in light of the remarks 
presented herein, and earnestly seeks timely allowance of all pending claims. 

OFFICIAL ACTION 

Preliminary Comments 

Corrected Claim Set 

In the Reply dated November 15, 2007, the submitted set of claims 1-6 was incorrect; it was 
the initial claim set 1-6 of this application. The current version of claims 1-6 is now correct 
and represents the claims the Examiner responded to in the Office Action dated February 4, 2008, 
and also represents the claims for which the Applicant submitted arguments in the Reply dated 
November 15, 2007. 

Claim Amendment 

Even though the Applicant does not agree with the rejection of independent claim 1, the 
Applicant has amended claim 2 to include all features of claim 1 in order to move prosecution 
forward. 

Claim Rejection - 35 U.S.C. § 102(e) / Claim Rejection - 35 U.S.C. § 103(a) 

Claims 1, 4-6 and 8 stand rejected under 35 U.S.C. § 102(e) as being allegedly 
anticipated over Stevens et al. (U.S. Patent Publication No. 2002/0138265). Claims 2 and 3 
stand rejected under 35 U.S.C. § 103(a) as being allegedly unpatentable over Stevens et al. (U.S. 
Patent Publication No. 2002/0138265) in view of Chen et al. (U.S. Patent 6,006,186). The 
Applicant respectfully traverses this rejection. 

For a Section 102 rejection to be proper, the cited reference must teach or suggest each 
and every claimed element. See M.P.E.P. 2131; M.P.E.P. 706.02. Thus, if the cited reference 
fails to teach or suggest one or more elements, then the rejection is improper and must be 
withdrawn. For a Section 103 rejection to be proper, a prima facie case of obviousness must be 
established. See M.P.E.P. 2142. One requirement to establish a prima facie case of obviousness 
is that the prior art references, when combined, must teach or suggest all claim limitations. See 
M.P.E.P. 2142; M.P.E.P. 706.02(j). Thus, if the cited references fail to teach or suggest one or 
more elements, then the rejection is improper and must be withdrawn. 

Argument A) Features of claims 2, 6 and 8 not taught by Stevens: 

Independent claim 2 recites, inter alia, "a context dependent acoustic model storage unit in 
which the context dependent acoustic models are stored in a form of sub-word state trees in each 
of which state sequences of a plurality of sub-word models of the context dependent acoustic 
models are organized in a tree structure." Emphasis added. Applicant respectfully traverses 
this rejection for the following reasons: 

The Examiner states that representing acoustic models by phonemes, wherein each 
phoneme may be represented as a triphone that includes multiple nodes, implies using context 
dependent acoustic models in a form of sub-word state trees, since each phoneme of the acoustic 
models contains multiple nodes that represent a tree structure (Final Office Action, page 2, 
section 1). The Examiner is incorrect as to what Stevens does and does not imply. 

Stevens discloses "[e]ach phoneme may be represented as a triphone that includes 
multiple nodes" (paragraph 75). Stevens additionally discloses "[a] triphone is content-dependent 
phoneme" (paragraph 75). Also, Stevens provides an example of a triphone as 'abc'. 
Thus, Stevens does imply: a triphone is an elementary linguistic unit that represents a sequence 
of three phonemes, as shown in the example, which is known to be the definition of a 
triphone. However, Stevens does not imply: context dependent acoustic models are stored in 
a form of sub-word state trees in each of which state sequences of a plurality of sub-word models 
of the context dependent acoustic models are organized in a tree structure. 

In the final Office Action, in response to the arguments that Stevens does not disclose a 
context dependent acoustic model storage unit storing context dependent acoustic models in a form 
of sub-word state trees, the Examiner responds stating that "Stevens et al., teach that the acoustic 
models represent phonemes, .... By representing acoustic models by phonemes, wherein each 
phoneme may be represented as a triphone that includes multiple nodes implies using a context 
dependent acoustic model storage unit storing context dependent acoustic models in a form of 
sub-word state trees, since each phoneme of the acoustic models contain multiple nodes that 
represent a tree structure." The Applicant respectfully disagrees with the Examiner's response. 

To better illustrate differences between the claimed invention and Stevens, and in order 
for the Examiner to clearly understand those differences, the Applicant has prepared and 
included Sheet No. 1 and Sheet No. 2. 

The attached Sheet No. 1 illustrates an example of a grammar network to be stored in the 
claimed language model storage unit, examples of conventional acoustic models 1 and 2, and 
acoustic models according to the present invention to be stored in the claimed context dependent 
acoustic model storage unit in (a), (b), (c), and (d) respectively. In the illustrations, circles 
represent nodes and arrowed lines between the nodes of triphones represent time intervals as 
short as, for example, 10 milliseconds. 

In (b) and (c), the triphones have multiple nodes, but do not represent a tree structure. 
A tree structure needs a single top node. The structure shown in (b) corresponds to the one 
shown in Fig. 2B of the present application. Stevens does not disclose how and in what way the 
triphones are stored. Thus, Stevens appears to use the conventional acoustic models as shown in 
(b) or (c) to be stored in the active vocabulary 230. 

Sheet No. 2 represents a comparative example of acoustic models which is not 
conventional but shows the acoustic models obtained by simply dividing the acoustic model 2 
shown in (c) such that triphones having the same center phonemes and the same preceding 
phonemes are gathered. The comparative example, however, does not represent a tree structure, 
which is required to have a single top node (root node) as explained above. 

In sum, Stevens merely discloses a sequence of three phonemes (i.e., 'abc'), and Stevens does 
not disclose that the context dependent acoustic models are stored in a form of sub-word state trees 
in each of which state sequences of a plurality of sub-word models of the context dependent acoustic 
models are organized in a tree structure. A sequence does not disclose or imply a tree structure. 
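
For illustration only, the distinction between separately stored state sequences and state 
sequences organized under a single top node can be sketched as follows. This is a minimal sketch 
under stated assumptions: the triphone labels, the state names (s1 to s6), the artificial "root" 
node, and the names StateNode, build_state_tree, and count_nodes are all hypothetical and are not 
taken from the application, the attached Sheets, or Stevens. 

```python
from dataclasses import dataclass, field


@dataclass
class StateNode:
    """One node of a hypothetical state tree."""
    label: str
    children: dict = field(default_factory=dict)  # state label -> StateNode
    models: list = field(default_factory=list)    # sub-word models ending here


def build_state_tree(root_label, state_sequences):
    """Merge the state sequences of several sub-word models under one root node.

    Sequences that begin with the same states share those nodes, so the
    result has a single top node and branches where the sequences diverge.
    """
    root = StateNode(root_label)
    for model_name, states in state_sequences.items():
        node = root
        for state in states:
            node = node.children.setdefault(state, StateNode(state))
        node.models.append(model_name)
    return root


def count_nodes(node):
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


# Hypothetical state sequences for three triphones (assumed data).
sequences = {
    "a-b+c": ["s1", "s2", "s3"],
    "a-b+d": ["s1", "s2", "s4"],
    "a-b+e": ["s1", "s5", "s6"],
}

# Stored as independent sequences: every state is kept once per model.
flat_state_count = sum(len(states) for states in sequences.values())

# Stored as a tree: shared leading states are kept once (root excluded).
tree = build_state_tree("root", sequences)
tree_state_count = count_nodes(tree) - 1
```

In this sketch the three independent sequences hold nine states in total, whereas the tree holds 
six states below its single top node, because the shared leading states are stored only once. 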

Independent claims 6 and 8 are allowable for similar reasons as set forth above in 
reference to independent claim 2. 

Argument B) Features of claims 2, 6, and 8 not taught by Stevens: 

Independent claim 2 recites, inter alia, "a matching unit developing hypotheses of sub-words 
by referencing the sub-word state tree representing the context dependent acoustic models, 
the word lexicon and the language models, and performing matching between feature parameters of 
input speech and the developed hypotheses so as to output word information including a word, an 
accumulated score and a beginning start frame with respect to a hypothesis representing a word end 
portion." Emphasis added. Applicant respectfully traverses this rejection for the following 
reasons: 

The above portion of claim 2 is formatted below for better understanding. The above 
portion of claim 2 recites: 

a matching unit 

    A) developing hypotheses of sub-words by referencing: 
        A1) the sub-word state tree representing the context dependent acoustic models, 
        A2) the word lexicon and 
        A3) the language models, 

    and (the matching unit also) 

    BA) performing matching between 
        BA1) feature parameters of input speech and 
        BA2) the developed hypotheses (which is A: A1, A2, and A3) 

    so as to output 

    BB) word information including 
        BB1) a word, 
        BB2) an accumulated score and 
        BB3) a beginning start frame with respect to a hypothesis representing a word end portion. 

The Examiner cites numerous portions of Stevens that use words similar to words found 
in the claim language, i.e., score, matches, and hypotheses, but does not treat the subject matter 
of the claim as a whole, nor consider the relationships recited. For instance: 

A) hypotheses of sub-words are developed by referencing A1), A2), and A3), then 

BA) a matching is performed between BA1) and BA2), where BA2) is A), to 

output BB) including BB1), BB2), and BB3). 

For example, the scores of paragraph 169 refer to the scores for a confusability matrix 
which relates to correcting the OOV word as disclosed in paragraph 161 and figure 20, which is 
for an embodiment of Finding Multiple Misrecognitions of Utterances in a Transcript (starting at 
paragraph 156). 

With respect to another embodiment, the Examiner cites paragraph 60, which discloses a score that 
reflects a probability that a hypothesis corresponds to user speech. This scoring is an associated 
score where a higher score represents a lower probability. Further, as shown in figure 4, Stevens 
discloses that the recognizer determines whether the user was likely to have spoken the word or 
words corresponding to the hypothesis by comparing the current score for the hypothesis to a 
threshold value. If the score exceeds the threshold value, then the recognizer determines that the 
hypothesis is too unlikely to merit further consideration and deletes the hypothesis. If the 
recognizer determines that the word or words corresponding to the hypothesis were likely to 
have been spoken by the user, then the recognizer determines whether the last word of the 
hypothesis is ending. The recognizer determines that a word is ending when the frame 
corresponds to the last component of the model for the word. If the recognizer determines that a 
word is ending, the recognizer sets a flag that indicates that the next frame may correspond to the 
beginning of a word. (See paragraphs 90-93). These hypotheses are not hypotheses of sub-words 
developed by referencing the sub-word state tree representing the context dependent acoustic 
models, the word lexicon and the language models. 
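
For illustration only, and without characterizing the actual implementation of Stevens, the 
threshold comparison and word-end flag summarized above can be sketched as follows. This is a 
minimal sketch under stated assumptions: the names Hypothesis, prune_and_flag, component_index, 
and model_length, as well as the numeric scores and threshold, are hypothetical and do not appear 
in Stevens or in the present application. 

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    words: list           # word or words corresponding to the hypothesis
    score: float          # higher score = lower probability, as summarized above
    component_index: int  # current position within the model of the last word
    model_length: int     # number of components in that model (assumed field)


def prune_and_flag(hypotheses, threshold):
    """Delete unlikely hypotheses and report whether any surviving word is ending.

    A hypothesis whose score exceeds the threshold is dropped. A surviving
    hypothesis whose current frame corresponds to the last component of its
    word model sets a flag that the next frame may begin a new word.
    """
    survivors = []
    next_frame_may_start_word = False
    for hyp in hypotheses:
        if hyp.score > threshold:
            continue  # too unlikely to merit further consideration
        survivors.append(hyp)
        if hyp.component_index == hyp.model_length - 1:
            next_frame_may_start_word = True
    return survivors, next_frame_may_start_word


# Assumed example data.
candidates = [
    Hypothesis(words=["hello"], score=12.0, component_index=4, model_length=5),
    Hypothesis(words=["yellow"], score=55.0, component_index=2, model_length=6),
    Hypothesis(words=["help"], score=20.0, component_index=1, model_length=4),
]
kept, may_start_word = prune_and_flag(candidates, threshold=30.0)
```

With the assumed data, the second hypothesis is deleted because its score exceeds the threshold, 
and the flag is set because the first surviving hypothesis has reached the last component of its 
word model. 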

Further, Stevens does not disclose performing matching between feature parameters of input 
speech and the developed hypotheses so as to output word information including a word, an 
accumulated score and a beginning start frame with respect to a hypothesis representing a word end 
portion. 

The selective citation to portions of embodiments in the reference that are not directly 
related in order to find disclosure of various claim features fails to address the invention, as a 
whole and as claimed. The Applicant respectfully asks the Examiner to fully explain how 
particular parts of the cited references teach each and every limitation of the claims as recited, as 
a whole. 

Independent claims 6 and 8 are allowable for similar reasons as set forth above in 
reference to independent claim 2. 

Dependent claims 3-5 are allowable for at least the same reasons as independent claim 2 
as set forth above. 

Conclusion 



Therefore, for at least the stated reasons, all claims are believed to be distinguishable over 
the cited references, individually or in combination. Therefore, claims 2-6 and 8 are allowable. 

In view of the above remarks and amendments, Applicant believes the pending 
application is in condition for allowance. 

Should there be any outstanding matters that need to be resolved in the present 
application, the Examiner is respectfully requested to contact Aslan Ettehadieh, Reg. No. 62,278, 
at the telephone number below, to conduct an interview in an effort to expedite prosecution in 
connection with the present application. 

If necessary, the Commissioner is hereby authorized in this, concurrent, and future replies 
to charge payment or credit any overpayment to Deposit Account No. 02-2448 for any additional 
fees required under 37 C.F.R. §§ 1.16 or 1.14; particularly, extension of time fees. 

Dated: June 3, 2008 Respectfully submitted, 



Charles Gorenstein 
Registration No.: 29,271 

BIRCH, STEWART, KOLASCH & BIRCH, LLP 
8110 Gatehouse Road 
Suite 100 East 
P.O. Box 747 

Falls Church, Virginia 22040-0747 
(703) 205-8000 
Attorney for Applicant 




Attachments: Sheet Nos. 1 & 2 